Always process `ChannelMonitorUpdate`s asynchronously #1897

TheBlueMatt · 2022-12-03T05:51:34Z

~~Based on #1887, #1863, and #1886. Will require #1507, I think as well.~~

We currently have two codepaths on most channel update functions -
most methods return a set of messages to send a peer iff the
ChannelMonitorUpdate succeeds, but if it does not we push the
messages back into the Channel and then pull them back out when
the ChannelMonitorUpdate completes and send them then. This adds
a substantial amount of complexity in very critical codepaths.

Instead, here we swap all our channel update codepaths to
immediately set the channel-update-required flag and only return a
ChannelMonitorUpdate to the ChannelManager. Internally in the
Channel we store a queue of ChannelMonitorUpdates, which will
become critical in future work to surface pending
ChannelMonitorUpdates to users at startup so they can complete.

This also lays the groundwork for supporting fully async ChannelMonitor updates by tracking in-flight updates so that we can re-run them on startup if the ChannelManager was persisted but the monitor update did not finish.

TheBlueMatt · 2022-12-03T05:54:15Z

All that to remove 100 LoC in our messaging pipelines :)

$ git diff-tree --stat origin/upstream/main HEAD lightning/src/ln/channel{manager,}.rs
 lightning/src/ln/channel.rs        |  495 ++++++-------
 lightning/src/ln/channelmanager.rs | 1342 ++++++++++++++++--------------------
 2 files changed, 862 insertions(+), 975 deletions(-)

codecov-commenter · 2022-12-03T06:07:55Z

Codecov Report

Base: 87.41% // Head: 87.59% // Increases project coverage by +0.17% 🎉

Coverage data is based on head (d8619ba) compared to base (62a88f9).
Patch coverage: 89.23% of modified lines in pull request are covered.

❗ Current head d8619ba differs from pull request most recent head 2adb8ee. Consider uploading reports for the commit 2adb8ee to get more accurate results

📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1897      +/-   ##
==========================================
+ Coverage   87.41%   87.59%   +0.17%     
==========================================
  Files         101      101              
  Lines       45633    47683    +2050     
  Branches    45633    47683    +2050     
==========================================
+ Hits        39891    41768    +1877     
- Misses       5742     5915     +173

Impacted Files	Coverage Δ
lightning/src/util/ser.rs	`84.72% <ø> (+1.28%)`	⬆️
lightning/src/ln/channel.rs	`83.99% <85.39%> (+0.36%)`	⬆️
lightning/src/ln/channelmanager.rs	`89.40% <89.49%> (+1.74%)`	⬆️
lightning/src/ln/chanmon_update_fail_tests.rs	`97.22% <93.33%> (-0.10%)`	⬇️
lightning/src/chain/chainmonitor.rs	`93.00% <100.00%> (-0.16%)`	⬇️
lightning/src/ln/functional_tests.rs	`97.33% <100.00%> (+0.04%)`	⬆️
lightning/src/ln/payment_tests.rs	`95.86% <100.00%> (ø)`
lightning/src/util/time.rs	`38.29% <0.00%> (-1.71%)`	⬇️
... and 11 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

TheBlueMatt · 2022-12-06T21:17:20Z

Should revisit #1886 (comment) as a part of this PR.

TheBlueMatt · 2023-01-12T19:33:46Z

Rebased, Note that the "Always process...async" commits will all need to be squashed down at the end as there are test failures introduced in between them which are fixed by completing the series. They're left separate for now for reviewability.

lightning/src/ln/channelmanager.rs

lightning/src/ln/channel.rs

jkczyz · 2023-02-17T03:25:06Z

lightning/src/ln/channel.rs

+			self.pending_monitor_updates.push(monitor_update);
+			return Ok(self.pending_monitor_updates.last().unwrap());


I see we don't call monitor_updating_paused here unlike most other places. What determines if it should be called?

This branch is in the MonitorUpdateInProgress-already-set case. monitor_update_paused primarily just sets some flags and then also MonitorUpdateInProgress. This if block just manually sets those flags instead. We could use the util but there's not much reason to.

Ah, gotcha. I'm not a big fan of the util, to be honest. Seems the code would be much readable without it, given the amount of unnecessary parameters in some cases.

Yea, definitely agree, guess I just kept using it cause it was already there and does "the thing". I'll remove it in a followup if that's okay.

We haven't had a `MonitorUpdateInProgress` check in `Channel::is_live` for some time.

In the coming commits we'll move to async `ChannelMonitorUpdate` application, which means we'll want to generate a `ChannelMonitorUpdate` (including a new counterparty commitment transaction) before we actually send it to our counterparty. To do that today we'd have to actually sign the commitment transaction by calling the signer, then drop it, apply the `ChannelMonitorUpdate`, then re-sign the commitment transaction to send it to our peer. In this commit we instead split `send_commitment_no_status_check` and `send_commitment_no_state_update` into `build_` and `send_` variants, allowing us to generate new counterparty commitment transactions without actually signing, then build them for sending, with signatures, later.

When a `ChannelMonitor` update completes, we may need to take some further action, such as exposing an `Event` to the user initiating another `ChannelMonitor` update, etc. This commit adds the basic structure to track such actions and serialize them as required. Note that while this does introduce a new map which is written as an even value which users cannot opt out of, the map is only filled in when users use the asynchronous `ChannelMonitor` updates. As these are still considered beta, breaking downgrades for such users is considered acceptable here.

In order to support fully async `ChannelMonitor` updating, we need to ensure that we can replay `ChannelMonitorUpdate`s if we shut down after persisting a `ChannelManager` but without completing a `ChannelMonitorUpdate` persistence. In order to support that we (obviously) have to store the `ChannelMonitorUpdate`s in the `ChannelManager`, which we do here inside the `Channel`. We do so now because in the coming commits we will start using the async persistence flow for all updates, and while we won't yet support fully async monitor updating it's nice to get some of the foundational structures in place now.

Over the next few commits, this macro will replace the `handle_monitor_update_res` macro. It takes a different approach - instead of receiving the message(s) that need to be re-sent after the monitor update completes and pushing them back into the channel, we'll not get the messages from the channel at all until we're ready for them. This will unify our message sending into only actually fetching + sending messages in the common monitor-update-completed code, rather than both there *and* in the functions that call `Channel` when new messages are originated.

TheBlueMatt · 2023-02-17T19:43:23Z

Rebased and added one fixup.

wpaulino · 2023-02-17T22:14:13Z

lightning/src/ln/channel.rs

+
+	/// Only fails in case of signer rejection. Used for channel_reestablish commitment_signed
+	/// generation when we shouldn't change HTLC/channel state.
+	fn send_commitment_no_state_update<L: Deref>(&self, logger: &L) -> Result<(msgs::CommitmentSigned, (Txid, Vec<(HTLCOutputInCommitment, Option<&HTLCSource>)>)), ChannelError> where L::Target: Logger {


Nit: doesn't seem like we use the returned HTLCs anywhere so we could remove them from the return type

Let's do it in a followup? That's an existing thing, no? (plus you're about to rewrite all that code :p)

lightning/src/ln/channelmanager.rs

wpaulino · 2023-02-17T23:33:40Z

lightning/src/ln/channel.rs

-				let (_, mut additional_update) = self.send_commitment_no_status_check(logger).map_err(|e| (None, e))?;
-				// send_commitment_no_status_check may bump latest_monitor_id but we want them to be
+				let mut additional_update = self.build_commitment_no_status_check(logger);
+				// build_commitment_no_status_check may bump latest_monitor_id but we want them to be
 				// strictly increasing by one, so decrement it here.
 				self.latest_monitor_update_id = monitor_update.update_id;


Shouldn't this track the pending queued updates as well, such that we can ensure we only resume the channel after all pending updates complete, rather than having to flap between monitor_paused and monitor_restored after each successful update?

Not quite sure why this comment is on this line, but in general we currently don't per the MonitorEvent::Completed::monitor_update_id docs (which ChainMonitor does implement properly). That's mostly because we previously didn't do any pending monitor update tracking in ChannelManager, but we're going to once we persist the new monitor update queue, so we should probably change this, but it doesn't have to be now.

lightning/src/ln/channelmanager.rs

wpaulino · 2023-02-18T00:02:15Z

lightning/src/ln/channelmanager.rs

+					if let Some(monitor_update) = monitor_update_opt.take() {
+						let update_id = monitor_update.update_id;
+						let update_res = self.chain_monitor.update_channel(funding_txo_opt.unwrap(), monitor_update);
+						break handle_new_monitor_update!(self, update_res, update_id, peer_state_lock, peer_state, chan_entry);


Shouldn't we only break if the monitor update didn't complete?

What's even the purpose of the is_shutdown check below? We can't guarantee the closing negotiation will be done by the time we get there, and we'll only check once as we break out of the loop immediately after. We also already remove the channel once we finish negotiation.

Channel::get_shutdown's docs reference it - May jump to the channel being fully shutdown (see [Self::is_shutdown]) in which case no [ChannelMonitorUpdate] will be returned).. Specifically, if the channel hasn't yet been funded then we'll jump to being fully shutdown, in which case we should just outright remove the channel. However, it looks like those are wrong! The check for whether to generate a monitor update is separate, and in face we definitely should do what the docs say! I've added another commit.

Ah, nice. Only thing now is to ensure we don't queue a shutdown for a channel that has yet to be funded, which is a requirement in BOLT-2, but feel free to leave that for a follow-up.

lightning/src/ln/functional_tests.rs

TheBlueMatt · 2023-02-19T00:15:09Z

Rebased and addressed feedback.

wpaulino

LGTM

lightning/src/ln/channel.rs

wpaulino · 2023-02-21T22:35:52Z

lightning/src/ln/channelmanager.rs

+					if let Some(monitor_update) = monitor_update_opt.take() {
+						let update_id = monitor_update.update_id;
+						let update_res = self.chain_monitor.update_channel(funding_txo_opt.unwrap(), monitor_update);
+						break handle_new_monitor_update!(self, update_res, update_id, peer_state_lock, peer_state, chan_entry);


Ah, nice. Only thing now is to ensure we don't queue a shutdown for a channel that has yet to be funded, which is a requirement in BOLT-2, but feel free to leave that for a follow-up.

jkczyz

LGTM. Feel free to squash.

jkczyz · 2023-02-22T00:00:44Z

lightning/src/ln/channel.rs

+			self.pending_monitor_updates.push(monitor_update);
+			return Ok(self.pending_monitor_updates.last().unwrap());


Ah, gotcha. I'm not a big fan of the util, to be honest. Seems the code would be much readable without it, given the amount of unnecessary parameters in some cases.

In a previous PR, we added a `MonitorUpdateCompletionAction` enum which described actions to take after a `ChannelMonitorUpdate` persistence completes. At the time, it was only used to execute actions in-line, however in the next commit we'll start (correctly) leaving the existing actions until after monitor updates complete.

The TODO mentioned in `handle_monitor_update_res` about how we might forget about HTLCs in case of permanent monitor update failure still applies in spite of all our changes. If a channel is drop'd in general, monitor-pending updates may be lost if the monitor update failed to persist. This was always the case, and is ultimately the general form of the the specific TODO, so we simply leave comments there

TheBlueMatt · 2023-02-22T00:51:51Z

Squashed the named fixups and fixed the missing doclink. Note that the Always process... commits all need to be squashed before landing, but will wait on some ACKs for that.

wpaulino

Why do those commits need to be squashed?

We currently have two codepaths on most channel update functions - most methods return a set of messages to send a peer iff the `ChannelMonitorUpdate` succeeds, but if it does not we push the messages back into the `Channel` and then pull them back out when the `ChannelMonitorUpdate` completes and send them then. This adds a substantial amount of complexity in very critical codepaths. Instead, here we swap all our channel update codepaths to immediately set the channel-update-required flag and only return a `ChannelMonitorUpdate` to the `ChannelManager`. Internally in the `Channel` we store a queue of `ChannelMonitorUpdate`s, which will become critical in future work to surface pending `ChannelMonitorUpdate`s to users at startup so they can complete. This leaves some redundant work in `Channel` to be cleaned up later. Specifically, we still generate the messages which we will now ignore and regenerate later. This commit updates the `ChannelMonitorUpdate` pipeline across all the places we generate them.

In the previous commit, we moved all our `ChannelMonitorUpdate` pipelines to use a new async path via the `handle_new_monitor_update` macro. This avoids having two message sending pathways and simply sends messages in the "monitor update completed" flow, which is shared between sync and async monitor updates. Here we reuse the new macro for handling `funding_signed` messages when doing an initial `ChannelMonitor` persistence. This provides a similar benefit, simplifying the code a trivial amount, but importantly allows us to fully remove the original `handle_monitor_update_res` macro.

Building on the previous commits, this finishes our transition to doing all message-sending in the monitor update completion pipeline, unifying our immediate- and async- `ChannelMonitor` update and persistence flows.

The `Channel::get_shutdown` docs are very clear - if the channel jumps to `Shutdown` as a result of not being funded when we go to initiate shutdown we should not generate a `ChannelMonitorUpdate` as there's no need to bother with the shutdown script - we're force-closing anyway. However, this wasn't actually implemented, potentially causing a spurious monitor update for no reason.

TheBlueMatt · 2023-02-22T17:36:17Z

Sadly some tests pass in the intermediary commits because the new code assumes that all ChannelMonitorUpdates are in the new Channel monitor update queue, and in the intermediary commits only some of them are. I squashed the monitor update commits with no changes, should be good to land.

TheBlueMatt added this to the 0.0.114 milestone Dec 3, 2022

TheBlueMatt changed the title ~~2022 11 monitor updates always async~~ Always process ChannelMonitorUpdates asynchronously Dec 3, 2022

TheBlueMatt added the blocked on dependent pr label Dec 3, 2022

TheBlueMatt force-pushed the 2022-11-monitor-updates-always-async branch from ab58f1c to 4b348e6 Compare December 3, 2022 19:13

TheBlueMatt added the blocked on next release Should Wait Until Next Release To Land label Dec 5, 2022

TheBlueMatt mentioned this pull request Dec 6, 2022

Relock channel_state in for each HTLC in claim_funds and lay the groundwork for async event generation #1886

Merged

TheBlueMatt mentioned this pull request Dec 16, 2022

Store channels per-peer #1507

Merged

TheBlueMatt force-pushed the 2022-11-monitor-updates-always-async branch from 4b348e6 to ace0660 Compare December 19, 2022 20:56

TheBlueMatt removed the blocked on next release Should Wait Until Next Release To Land label Dec 21, 2022

TheBlueMatt mentioned this pull request Jan 5, 2023

BaseSign + ChannelMonitor temporary error support #1538

Closed

TheBlueMatt force-pushed the 2022-11-monitor-updates-always-async branch 2 times, most recently from e0bb80d to 809dabc Compare January 12, 2023 19:32

TheBlueMatt removed the blocked on dependent pr label Jan 12, 2023

TheBlueMatt force-pushed the 2022-11-monitor-updates-always-async branch from 809dabc to 4217ec3 Compare January 15, 2023 22:44

TheBlueMatt added the Seeking Code Review label Jan 15, 2023

This was referenced Jan 15, 2023

Trivial Serialization Tweaks #1956

Merged

Pass MonitorUpdates by ref and tweak manager lockorder #1957

Merged

TheBlueMatt force-pushed the 2022-11-monitor-updates-always-async branch from 4217ec3 to d27d3bd Compare January 15, 2023 23:54

TheBlueMatt added blocked on dependent pr and removed Seeking Code Review blocked on dependent pr labels Jan 16, 2023

TheBlueMatt force-pushed the 2022-11-monitor-updates-always-async branch 3 times, most recently from ac1a550 to 78606e7 Compare January 19, 2023 00:42

jkczyz self-requested a review January 19, 2023 18:20

wpaulino self-requested a review January 19, 2023 23:42

jkczyz reviewed Feb 17, 2023

View reviewed changes

TheBlueMatt added 5 commits February 17, 2023 19:09

Fix the err msg provided when a send fails due to peer disconnected

5be29c6

We haven't had a `MonitorUpdateInProgress` check in `Channel::is_live` for some time.

TheBlueMatt force-pushed the 2022-11-monitor-updates-always-async branch from 8ba64e2 to 46d0711 Compare February 17, 2023 19:43

wpaulino reviewed Feb 18, 2023

View reviewed changes

TheBlueMatt force-pushed the 2022-11-monitor-updates-always-async branch from cc9a9ef to 0c85c30 Compare February 18, 2023 21:28

wpaulino reviewed Feb 21, 2023

View reviewed changes

jkczyz reviewed Feb 22, 2023

View reviewed changes

TheBlueMatt added 2 commits February 22, 2023 00:51

TheBlueMatt force-pushed the 2022-11-monitor-updates-always-async branch from 4e34ff1 to d8619ba Compare February 22, 2023 00:51

wpaulino previously approved these changes Feb 22, 2023

View reviewed changes

jkczyz previously approved these changes Feb 22, 2023

View reviewed changes

TheBlueMatt added 4 commits February 22, 2023 17:34

Use the new monitor persistence flow for funding_created handling

685b08d

Building on the previous commits, this finishes our transition to doing all message-sending in the monitor update completion pipeline, unifying our immediate- and async- `ChannelMonitor` update and persistence flows.

TheBlueMatt dismissed stale reviews from jkczyz and wpaulino via 2adb8ee February 22, 2023 17:34

TheBlueMatt force-pushed the 2022-11-monitor-updates-always-async branch from d8619ba to 2adb8ee Compare February 22, 2023 17:34

wpaulino approved these changes Feb 22, 2023

View reviewed changes

jkczyz approved these changes Feb 22, 2023

View reviewed changes

TheBlueMatt merged commit 96c8507 into lightningdevkit:main Feb 22, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Always process `ChannelMonitorUpdate`s asynchronously #1897

Always process `ChannelMonitorUpdate`s asynchronously #1897

TheBlueMatt commented Dec 3, 2022 •

edited

Loading

TheBlueMatt commented Dec 3, 2022

codecov-commenter commented Dec 3, 2022 •

edited

Loading

TheBlueMatt commented Dec 6, 2022

TheBlueMatt commented Jan 12, 2023

jkczyz Feb 17, 2023

TheBlueMatt Feb 17, 2023

jkczyz Feb 22, 2023

TheBlueMatt Feb 22, 2023

TheBlueMatt commented Feb 17, 2023

wpaulino Feb 17, 2023

TheBlueMatt Feb 18, 2023

wpaulino Feb 17, 2023

TheBlueMatt Feb 18, 2023

wpaulino Feb 18, 2023

wpaulino Feb 18, 2023

TheBlueMatt Feb 19, 2023

wpaulino Feb 21, 2023

TheBlueMatt commented Feb 19, 2023

wpaulino left a comment

wpaulino Feb 21, 2023

jkczyz left a comment

jkczyz Feb 22, 2023

TheBlueMatt commented Feb 22, 2023

wpaulino left a comment

TheBlueMatt commented Feb 22, 2023

		self.pending_monitor_updates.push(monitor_update);
		return Ok(self.pending_monitor_updates.last().unwrap());

Always process ChannelMonitorUpdates asynchronously #1897

Always process ChannelMonitorUpdates asynchronously #1897

Conversation

TheBlueMatt commented Dec 3, 2022 • edited Loading

TheBlueMatt commented Dec 3, 2022

codecov-commenter commented Dec 3, 2022 • edited Loading

Codecov Report

TheBlueMatt commented Dec 6, 2022

TheBlueMatt commented Jan 12, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

TheBlueMatt commented Feb 17, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

TheBlueMatt commented Feb 19, 2023

wpaulino left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jkczyz left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

TheBlueMatt commented Feb 22, 2023

wpaulino left a comment

Choose a reason for hiding this comment

TheBlueMatt commented Feb 22, 2023

Always process `ChannelMonitorUpdate`s asynchronously #1897

Always process `ChannelMonitorUpdate`s asynchronously #1897

TheBlueMatt commented Dec 3, 2022 •

edited

Loading

codecov-commenter commented Dec 3, 2022 •

edited

Loading